import course staff management command #255

varshamenon4 · 2024-03-05T15:52:22Z

Description: Create a management command to populate the CourseStaffRole model and any related User objects with data from the LMS. Once merged, create an ESRE ticket to upload the CSV file and run the command.

varshamenon4

Just wanted to put this draft out for initial review to get feedback and to make sure I'm on the right track. Also, I'm guessing for testing that I can just create a CSV and test locally that this populates the db in django admin?

edx_exams/apps/core/management/commands/bulk_add_course_staff.py

zacharis278 · 2024-03-21T15:17:28Z

edx_exams/apps/core/management/commands/test/test_bulk_add_course_staff.py

+LOGGER_NAME = 'edx_exams.apps.core.management.commands.bulk_add_course_staff'
+
+
+class TestBulkAddCourseStaff(TestCase):


I think it would be worth adding a test here that ensures this command produces the expected number of queries with assertNumQueries. Example:

edx-exams/edx_exams/apps/lti/tests/test_views.py

Line 716 in a6bedee

with self.assertNumQueries(4):

That makes sense! When I used this method, I saw that there are a number of queries: https://github.com/edx/edx-exams/actions/runs/8394389300/job/22991407617?pr=255. Is it better to check only the relative number of queries (for example, that the batch size affects the query count), or is there a better way to fine-tune this? I couldn't figure out a good way to constrain the count to a certain type of query (for example, just the bulk query). I was poking around here: https://docs.djangoproject.com/en/5.0/topics/testing/tools/

You can probably just make this its own test rather than baking it into the others as an extra assertion. For that one I'd try a bunch of extra rows to ensure the query count has not increased in an unexpected way based on more data.

For getting the expected number, you should be able to walk through the code and count up what you expect. If it's different, it would be good to know why or where the extra query is coming from. There are a few ways to debug Django queries, but an easy one is to use

from django.db import connection
print(connection.queries)
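As a rough aid for the "count things up" step, here is a minimal sketch of how many times the batching loop in the command (`for i in range(0, len(reader), batch_size)`) would run for a given number of CSV rows. The helper name and the row counts are hypothetical. It only counts loop iterations, so any per-row lookup inside the loop would add queries on top of this number. Also note that `connection.queries` is only populated when `DEBUG` is `True`.

```python
def expected_batch_count(row_count, batch_size):
    """Number of iterations of `for i in range(0, row_count, batch_size)`."""
    return len(range(0, row_count, batch_size))


# Hypothetical row counts, all with a batch size of 1000.
counts = {rows: expected_batch_count(rows, 1000) for rows in (0, 1, 1000, 1001, 2500)}
print(counts)  # {0: 0, 1: 1, 1000: 1, 1001: 2, 2500: 3}
```

If the number passed to assertNumQueries grows faster than this as rows are added, that is a hint that something inside the loop is querying per row.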

varshamenon4

Ready for review! Added some specific questions about batch testing and also testing using assertNumQueries.

varshamenon4 · 2024-03-22T17:59:54Z

edx_exams/apps/core/management/commands/bulk_add_course_staff.py

+        parser.add_argument(
+            '--batch_size',
+            type=int,
+            default=10000,


Is this a good default size?

As a default, I think we should keep the number lower, in the hundreds or 1k at most. I'm not sure when we'd start running into memory issues with a really big bulk create query. You could play with this locally to get an estimate, however.

edx_exams/apps/core/management/commands/bulk_add_course_staff.py

edx_exams/apps/core/management/commands/test/test_bulk_add_course_staff.py

zacharis278

still looking over things but a few comments to start

zacharis278 · 2024-03-22T19:50:27Z

zacharis278 · 2024-03-22T20:05:36Z

zacharis278 · 2024-03-25T19:30:21Z

edx_exams/apps/core/management/commands/test/test_bulk_add_course_staff.py

+            csv = self._write_test_csv(csv, lines)
+            with self.assertNumQueries(2):
+                call_command(self.command, f'--csv_path={csv.name}')
+                assert CourseStaffRole.objects.filter(course_id='course-v1:edx+test+f20').count() == 1


It would be better to test that the correct user/role was created here rather than asserting that any role exists. Same for the tests below.
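To illustrate why a count-only assertion is weak, here is a Django-free sketch that uses plain named tuples as stand-ins for CourseStaffRole rows. The usernames and the `roles_for_course` helper are hypothetical. The count check passes even though the wrong user was stored, while comparing against the exact expected role does not.

```python
from collections import namedtuple

# Stand-in for a CourseStaffRole row; not the real model.
Role = namedtuple('Role', ['username', 'course_id', 'role'])

COURSE_ID = 'course-v1:edx+test+f20'


def roles_for_course(rows, course_id):
    """Return the set of stored roles for one course."""
    return {row for row in rows if row.course_id == course_id}


# Pretend the command stored a role for the wrong user.
stored = [Role('wrong_user', COURSE_ID, 'staff')]
expected = {Role('staff_user', COURSE_ID, 'staff')}

count_only_passes = len(roles_for_course(stored, COURSE_ID)) == 1
exact_match_passes = roles_for_course(stored, COURSE_ID) == expected
print(count_only_passes, exact_match_passes)  # True False
```

In the real test, the equivalent would be to filter on the specific user and role as well as the course.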

edx_exams/apps/core/management/commands/test/test_bulk_add_course_staff.py

MichaelRoytman

This looks good! I left a few, hopefully quick, questions for you.

edx_exams/apps/core/management/commands/bulk_add_course_staff.py

MichaelRoytman · 2024-03-28T13:39:21Z

edx_exams/apps/core/management/commands/bulk_add_course_staff.py

+        """
+        Add the given set of course staff provided in csv
+        """
+        reader = list(unicodecsv.DictReader(csv_file))


What made you choose the unicodecsv package? I ask because Python's native csv module has a DictReader class that should provide the same functionality. Looking at the docs for unicodecsv, it looks like it improves upon some unicode issues in Python 2, but we're on Python 3 here, so I'd imagine we don't benefit from unicodecsv. Where possible, I think it would be better to leverage native packages. I think this should be an easy swap without any impact to the command or its tests. What do you think?

As a nit, at least for the csv.DictReader class, the class is iterable, so you don't have to cast the value to a list; you can just iterate the instance. Maybe unicodecsv.DictReader is different? Oh, I see. You're casting it to a list because you're iterating reader twice and don't want to exhaust the iterable, right?

Ah, good point! I've changed to use csv. The only reason I used unicodecsv was that I had referenced code from elsewhere... so not really a good reason haha.
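For reference, a small sketch of the standard library's `csv.DictReader` on an in-memory file. The column names and values are assumptions for illustration, not the actual CSV layout used by the command. It also shows the point above about exhausting the reader: a second pass over the same reader yields nothing.

```python
import csv
import io

# Hypothetical CSV contents; the real file's columns may differ.
csv_file = io.StringIO(
    'username,email,role,course_id\n'
    'staff_user,staff@example.com,staff,course-v1:edx+test+f20\n'
)

reader = csv.DictReader(csv_file)

# First pass consumes the underlying file.
first_pass = [row['username'] for row in reader]
# Second pass finds the reader already exhausted.
second_pass = [row['username'] for row in reader]

print(first_pass, second_pass)  # ['staff_user'] []
```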

I primarily cast to a list because I check the length of the reader (in order to batch the creates for the course staff). Lemme know if this is a bad idea though!
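Casting to a list is reasonable for a file of modest size. If the file ever grew large enough that holding every row in memory mattered, one alternative is to batch straight from the iterator without needing its length. This is a sketch with a hypothetical helper name, not what the command does, and it would not help on its own if the rows still need to be read twice.

```python
from itertools import islice


def iter_batches(rows, batch_size):
    """Yield lists of up to `batch_size` items from any iterable."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


batches = list(iter_batches(range(5), 2))
print(batches)  # [[0, 1], [2, 3], [4]]
```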

edx_exams/apps/core/management/commands/bulk_add_course_staff.py

MichaelRoytman · 2024-03-28T13:52:33Z

edx_exams/apps/core/management/commands/bulk_add_course_staff.py

+        for i in range(0, len(reader), batch_size):
+            CourseStaffRole.objects.bulk_create(
+                CourseStaffRole(
+                    user=User.objects.get(


I'm not sure about the answer to this question, and maybe this is not possible. But I noticed that you create the users_existing set by querying the User model via the username field. Here, though, you're querying the User model via both the username and the email fields. I'm wondering whether a scenario is possible where a user would be found in users_existing but not in this query.

If we're pulling this data from the LMS, I think this could happen, because email is modifiable, but username isn't. I think email is maintained on our User model via syncing with the JWT when they hit edx-exams in the process of an exam, right? Do you think this is something we need to address here? I'm just not sure this code guarantees a user will exist, which could cause an exception here.

Since the function runs in a transaction, it should be okay: it will roll back, we can correct the CSV, and try again. But it would be better to handle this explicitly.

Oooh, that's a really good point. Since the username is unique, could I just get by username instead? I've updated to be that way but let me know if that could cause any issues.

I think that's good! username should not change.
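The mismatch discussed in this thread can be sketched without Django, using dictionaries as stand-ins for User rows. The `find_user` helper and the sample values are hypothetical. A lookup by username alone finds the user, while a lookup by username and a stale email does not. With the real ORM, `User.objects.get` would raise `User.DoesNotExist` in the second case rather than return `None`.

```python
# Stand-in for the User table: the email was changed after the CSV was exported.
users = [{'username': 'staff_user', 'email': 'new@example.com'}]

# Stand-in for one CSV row carrying the old email.
csv_row = {'username': 'staff_user', 'email': 'old@example.com'}


def find_user(rows, **fields):
    """Return the first row matching every given field, or None."""
    for row in rows:
        if all(row.get(key) == value for key, value in fields.items()):
            return row
    return None


by_username = find_user(users, username=csv_row['username'])
by_username_and_email = find_user(
    users, username=csv_row['username'], email=csv_row['email']
)
print(by_username is not None, by_username_and_email is not None)  # True False
```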

edx_exams/apps/core/management/commands/test/test_bulk_add_course_staff.py

MichaelRoytman

This looks great! Thank you for incorporating my feedback.

zacharis278

🚀

varshamenon4 commented Mar 5, 2024

zacharis278 reviewed Mar 5, 2024

zacharis278 reviewed Mar 19, 2024

zacharis278 reviewed Mar 21, 2024

varshamenon4 force-pushed the varshamenon4/import-course-staff-mgmt-cmd branch from 0d5b026 to 901ebb0 on March 22, 2024 17:36

varshamenon4 marked this pull request as ready for review March 22, 2024 17:36

varshamenon4 commented Mar 22, 2024

zacharis278 reviewed Mar 22, 2024

zacharis278 reviewed Mar 25, 2024

varshamenon4 force-pushed the varshamenon4/import-course-staff-mgmt-cmd branch 2 times, most recently from 38b57fe to b0a9515 on March 28, 2024 12:53

MichaelRoytman reviewed Mar 28, 2024

varshamenon4 force-pushed the varshamenon4/import-course-staff-mgmt-cmd branch from 5c86af2 to e4f5931 on March 29, 2024 16:08

MichaelRoytman approved these changes Apr 2, 2024

feat: add course staff mgmt command (195a50e)

varshamenon4 force-pushed the varshamenon4/import-course-staff-mgmt-cmd branch from 87fdab0 to 195a50e on April 2, 2024 16:53

zacharis278 approved these changes Apr 2, 2024

varshamenon4 merged commit cae41d6 into main Apr 2, 2024 (7 checks passed)

varshamenon4 deleted the varshamenon4/import-course-staff-mgmt-cmd branch April 2, 2024 17:15
